Abstract

One of the most relevant problems in the extraction of
scientifically useful information from wide field
astronomical images (both photographic plates and CCD
frames) is the recognition of the objects against a noisy
background and their classification into unresolved
(star-like) and resolved (galaxy) sources. In this paper
we present a neural network based method capable of
performing both tasks and discuss in detail the
performance of object detection in a representative
celestial field. The performance of our method is
compared to that of other methodologies often used within
the astronomical community.



1. Introduction 

Astronomical wide field imaging (hereafter WFI) and its
most extreme case, all-sky surveys such as the Palomar
Sky Surveys (POSS I & II), are the main tools for
tackling astronomical problems that require statistically
significant samples of optically selected objects. In the
past, WFI has also been the main supplier of targets for
photometric and spectroscopic follow-ups at telescopes of
the 4 meter class. The exploitation of the new generation
of telescopes in the 8 meter class, which are mainly
aimed at observing targets too faint to be detected on
photographic material (the POSS-II detection limit in B
is ~ 21.5 mag), requires new digitised surveys carried
out with large format CCD detectors mounted at dedicated
2 meter class telescopes. Much effort is currently
devoted worldwide to constructing such facilities: the
MEGACAM project at the CFH, the ESO Wide Field Imager at
the 2.2 meter telescope, the Sloan DSS and the ESO/OAC
VST, to quote only some of the ongoing or planned
experiments. One aspect which cannot be stressed too
often is the enormous problem posed by the handling,
processing and archiving of the data produced by these
instruments: the VST, for instance [?], is expected to
produce a flow of almost 20 GByte of data per night, or
10 TByte per year of operation. Such a huge flow of data
cannot be dealt with effectively by traditional data
reduction packages and calls for modern A.I. based
approaches.

In this paper we present a new, neural network (NN)
based method capable of performing object detection and
star/galaxy separation. Due to space limitations we
shall focus our attention mainly on the experimental
results relative to the first step.



2. Preprocessing and object detection 

After the standard preprocessing of the data [?], we
perform the following steps:

- We first run a 3x3 or 5x5 window on the image in
order to determine the value of the central pixel.

- We then use Robust Principal Component Analysis
(PCA) NN's to reduce the dimensionality of the input
space to 3.

- Then, since supervised NN's need a large amount of
labeled data to obtain a good classification, we use
unsupervised NN's to segment the pixels into six classes
(one for the background and five for the objects).

- We then group the five object classes into one and
are left with only two classes: background and objects.

- Finally, in order to split overlapping objects, we
run a simple but effective deblending algorithm, capable
of isolating the objects against the noisy background.
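
The first, second and fourth steps above can be sketched as
follows. This is a minimal illustration under stated
assumptions, not the authors' code: plain linear PCA (via
SVD) is swapped in for the robust PCA NN, reflective padding
at the image border is an assumption, and the function names
`window_vectors`, `linear_pca` and `object_mask` are
hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def window_vectors(image, size=3):
    """Return one flattened size x size neighbourhood per pixel.

    Border pixels are handled with reflective padding (an assumption).
    """
    image = np.asarray(image, dtype=float)
    pad = size // 2
    padded = np.pad(image, pad, mode="reflect")
    windows = sliding_window_view(padded, (size, size))  # (ny, nx, size, size)
    return windows.reshape(image.shape[0] * image.shape[1], size * size)


def linear_pca(X, n_components=3):
    """Project X on its first principal components (linear PCA stand-in)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


def object_mask(labels, background_class, shape):
    """Merge all non-background classes into a single 'object' class."""
    return (np.asarray(labels) != background_class).reshape(shape)
```

Each row of the array returned by `window_vectors` describes
one pixel; after `linear_pca` it becomes a 3-dimensional
vector that an unsupervised clustering step could label.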

2.1 Preprocessing and object detection 

PCAs can be neurally realized in various ways; we used
a feedforward neural network with only one layer, which
is able to extract the principal components of the stream
of input vectors. The structure of the PCA NN can be
summarized as follows: there is one input layer and one
forward layer of neurons fully connected to the inputs;
during the learning phase there are feedback links among
the neurons, which classify the network structure as
either hierarchical or symmetric, depending on the
feedback connections of the output layer neurons.
Typically, Hebbian type learning rules are used. Many
different versions and extensions of the basic learning
algorithm have been proposed in recent years [?], [?],
[?]. After the learning phase, the network becomes purely
feedforward. Ref. [?] proved that PCA neural algorithms
can be derived from optimization problems, such as
variance maximization and representation error
minimization, and derived the so-called robust PCA
algorithms and nonlinear PCA algorithms. More precisely,
in the robust generalization of variance maximization,
the objective function f(z) is assumed to be a valid cost
function [?] such as ln cosh(z) and |z|. This leads to
the adaptation step of the learning algorithm:



$$ w_j^{(t+1)} = w_j^{(t)} + \mu \, g\left(y_j^{(t)}\right) e_j^{(t)} \qquad (1) $$

where:

$$ e_j^{(t)} = x^{(t)} - \sum_{i=1}^{l(j)} y_i^{(t)} w_i^{(t)}, \qquad g = \frac{df}{dz} $$



In the hierarchical case l(j) = j and in the symmetric
case l(j) = M. The learning function g, the derivative
of f, is applied separately to each component of the
argument vector. In previous experiments [?] we found
that the hierarchical robust NN of eq. (1) with learning
function g(x) = tanh(αx) performs better than all the
other PCA NN's and linear PCA.
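
A single adaptation step of the robust PCA rule with
g(x) = tanh(αx) can be sketched as below. This is a hedged
illustration, not the authors' implementation: it assumes
the usual form w_j <- w_j + μ g(y_j) e_j with neuron outputs
y_j = w_j^T x and error e_j = x - Σ_{i<=l(j)} y_i w_i; the
function name `robust_pca_step` and the default values of
`mu` and `alpha` are hypothetical.

```python
import numpy as np


def robust_pca_step(W, x, mu=0.01, alpha=1.0, hierarchical=True):
    """Apply one robust PCA update to the weight matrix W.

    W has shape (M, d): row j is the weight vector w_j of neuron j.
    x has shape (d,): the current input vector.
    """
    W = np.asarray(W, dtype=float)
    x = np.asarray(x, dtype=float)
    y = W @ x                    # neuron outputs y_j = w_j^T x
    g = np.tanh(alpha * y)       # learning function applied component-wise
    M = W.shape[0]
    W_new = W.copy()
    for j in range(M):
        # l(j) = j in the hierarchical case (1-based), M in the symmetric case
        l = j + 1 if hierarchical else M
        e_j = x - y[:l] @ W[:l]  # error term using the old weights
        W_new[j] += mu * g[j] * e_j
    return W_new
```

In the symmetric case with orthonormal weights spanning the
whole input space, the error term vanishes and the weights
are left unchanged, which is a quick sanity check of the
update.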

2.2 Unsupervised NNs 

The NNs used in this section are based on the classical
unsupervised neural models: Kohonen Self Organizing Maps
[?], Neural-Gas [13], Growing Cell Structure (GCS) [?],
on-line K-means clustering algorithm [12], Maximum
Entropy NN [19]. All these methods partition the input
space into clusters and assign to each neuron a weight
vector corresponding to the template characteristic of a
cluster in the input space. As a consequence, after the
learning, an input pattern is assigned to the class
corresponding to the nearest neuron.
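
Of the models listed above, on-line K-means is the simplest
to illustrate. The sketch below is an assumption-laden
example, not the configuration used in the paper: the
learning rate, the number of epochs, the initialization from
randomly chosen input patterns and the names `online_kmeans`
and `assign` are all hypothetical.

```python
import numpy as np


def online_kmeans(X, n_neurons, lr=0.05, epochs=5, seed=0):
    """Winner-take-all learning: move only the nearest neuron toward x."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialize the weight vectors with randomly chosen input patterns.
    W = X[rng.choice(len(X), n_neurons, replace=False)].copy()
    for _ in range(epochs):
        for x in X[rng.permutation(len(X))]:
            k = np.argmin(((W - x) ** 2).sum(axis=1))  # winner neuron
            W[k] += lr * (x - W[k])
    return W


def assign(X, W):
    """Assign each pattern to the class of its nearest neuron."""
    X = np.asarray(X, dtype=float)
    d = ((X[:, None, :] - W[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)
```

After learning, `assign` implements the rule stated above: a
pattern takes the label of the neuron whose weight vector is
closest to it.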

We preferred to reduce the well-known complexity of the
post-processing labeling by adding an unsupervised single
layer NN to the output of the first layer NN. In this
way, the second layer NN learns from the weights of the
first layer NN and clusters the neurons on the basis of a
similarity measure or a distance. The iteration of this
process gives the unsupervised hierarchical NN's. The
number of neurons at each layer decreases from the first
to the output layer and, as a consequence, the NN takes
on a pyramidal aspect. The NN takes a pattern X as input
and the first layer finds the winner neuron. The second
layer takes the first layer winner weight vector as input
and finds the second layer winner neuron, and so on up to
the top layer. The activation value of the output layer
neurons is 1 for the winner unit and 0 for all the
others.
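
The forward pass of such a pyramidal network can be sketched
as follows, assuming Euclidean distance as the similarity
measure and already trained weight matrices; the names
`winner` and `hierarchical_forward` are hypothetical.

```python
import numpy as np


def winner(W, v):
    """Index of the neuron whose weight vector is nearest to v."""
    return int(np.argmin(((np.asarray(W, dtype=float) - v) ** 2).sum(axis=1)))


def hierarchical_forward(layers, x):
    """Propagate x through the layers, from the first to the top one.

    layers is a list of weight matrices, one per layer; each layer
    receives the weight vector of the previous layer's winner.
    Returns the output activations: 1 for the winner, 0 elsewhere.
    """
    v = np.asarray(x, dtype=float)
    k = None
    for W in layers:
        W = np.asarray(W, dtype=float)
        k = winner(W, v)
        v = W[k]  # the winner's weight vector feeds the next layer
    out = np.zeros(len(layers[-1]))
    out[k] = 1.0
    return out
```

With a two-neuron top layer, the output directly encodes the
two classes (background and objects) without a separate
labeling step.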

By varying the learning algorithms we obtain different
NN's with different properties and abilities. For
instance, by using only SOMs we have a Multi-layer SOM
(ML-SOM) [?] where every layer is a two-dimensional grid.
We can easily obtain ML-NeuralGas, ML-Maximum Entropy or
ML-K means organized on a hierarchy of linear layers
[21]. The ML-GCS has a more complex architecture and has
at least 3 units per layer. By varying the learning
algorithms in the different layers we can take advantage
of the properties of each model (for example, since we
cannot have a ML-GCS with 2 output units, we can use
another NN in the output layer). A hierarchical NN with a
number of output layer neurons equal to the number of
output classes simplifies the expensive post-processing
step of labeling the output neurons in classes, without
reducing the generalization capacity of the NN.



3. Star/Galaxy separation 

The first step in order to perform star/galaxy
separation is to identify the most significant features.
Then we run an optimized Multi-Layer Perceptron (MLP).
Refs. [?] and [17] summarize methods to overcome the
problems related to local minima and slow convergence of
the above algorithm.

The object features were chosen following the
literature [?], [13], [15], and selected by a simple
sequential forward selection process [?], so as to select
the best performing ones. In particular, we took into
consideration the following features:

• Six features describing the ellipses circumscribing
the objects: the photometric barycenter coordinates,
the isophotal flux, the semimajor axis, the semiminor
axis, the position angle, and the object area (A) in
pixels.



• Twelve features suggested by Odewahn [15]: the
object diameter, the ellipticity, the average surface
brightness (⟨SuBr⟩), the central intensity (I_0), the
filling factor, the area logarithm, the harmonic
radius and five simple gradients of the light
distribution G14, G13, G12, G23 and G34, defined as:

$$ G_{ij} = \frac{T_j - T_i}{r_i - r_j} $$

where T_i is the average surface brightness within an
ellipse with position angle α, semimajor axis r_i ≤ a
and ellipticity ell. To this aim, four equidistant
radii r_i are selected, with r_i = i a/4, i = 1, ..., 4.

• Two more features are taken from Miller [?]: the
two ratios T_r = ⟨SuBr⟩ / I_0 and T_cA = I_0 / √A.

• Finally, five features from FOCAS [?]: the second
and the fourth total moments of the light
distribution, the central intensity averaged in a
3 x 3 area, the ellipticity averaged over the whole
object area and, finally, the "Kron" radius defined
as:
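
A commonly used first-moment form of the Kron radius,
assumed here for illustration, is r_K = Σ r I(x, y) /
Σ I(x, y), with the sums running over the object pixels and
r the distance from the object centre. The sketch below
computes it together with the five gradients of the light
distribution, taking G_ij = (T_j - T_i) / (r_i - r_j) with
r_i = i a/4 as an assumed form. The function names
`kron_radius` and `light_gradients` and their argument
layout are hypothetical.

```python
import numpy as np


def kron_radius(image, mask, xc, yc):
    """Intensity-weighted mean radius over the pixels selected by mask."""
    image = np.asarray(image, dtype=float)
    yy, xx = np.nonzero(mask)
    intensity = image[yy, xx]
    r = np.hypot(xx - xc, yy - yc)
    return float((r * intensity).sum() / intensity.sum())


def light_gradients(T, a):
    """Gradients G14, G13, G12, G23, G34 from four surface brightnesses.

    T[i-1] is the average surface brightness within the ellipse of
    semimajor axis r_i = i * a / 4, for i = 1..4.
    """
    r = [i * a / 4.0 for i in range(1, 5)]
    pairs = [(1, 4), (1, 3), (1, 2), (2, 3), (3, 4)]
    return {
        f"G{i}{j}": (T[j - 1] - T[i - 1]) / (r[i - 1] - r[j - 1])
        for i, j in pairs
    }
```

For a light profile whose average surface brightness falls
linearly with radius, all five gradients computed this way
are equal.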
In order to optimize the performance of the
classification system it is necessary to reduce the
number of features. To do so we need training and test
sets for a subset of our objects. In our case we selected
a subset of the Infante and Pritchet catalog [?], [?],
built with deeper images obtained under sub-arcsec seeing
conditions. We experimented with both unsupervised and
supervised NN's for both the feature selection and the
classification phases but, since unsupervised NN's did
not reach appreciable results, in this paper we present
only results obtained with MLP's.

The sequential backward elimination strategy [?] works
as follows: suppose that we initially have all M features
in one set and run the NN's with this set. Then, we build
M different sets with M - 1 features each, run one NN for
each set and take the set obtaining the best
classification, in this way eliminating the least
significant feature. Usually, after this first step the
classification error decreases if there are noisy or
redundant features. Then, we repeat these steps,
eliminating one feature at each step.
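
The elimination loop just described can be sketched
independently of the classifier. In the example below,
`score` stands for any function returning the classification
quality obtained with a given feature subset (for instance
the test-set accuracy of a trained NN); this callable, the
stopping parameter `n_keep` and the function name
`backward_elimination` are assumptions made for
illustration.

```python
def backward_elimination(features, score, n_keep):
    """Drop one feature per step, keeping the best-scoring subset.

    Returns the surviving features and the history of
    (subset, score) pairs, starting with the full set.
    """
    current = list(features)
    history = [(tuple(current), score(current))]
    while len(current) > n_keep:
        # Build the len(current) subsets obtained by removing one feature.
        candidates = [[f for f in current if f != drop] for drop in current]
        # Keep the subset with the best classification quality.
        current = max(candidates, key=score)
        history.append((tuple(current), score(current)))
    return current, history
```

Inspecting `history` shows where the score stops improving,
which is one way to decide how many features to retain.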

As far as supervised learning NN's are concerned, we
used MLP's [?] with one hidden layer of 20 neurons and
only one output, assuming value 0 for star and value 1
for galaxy. After the training, for each pattern of the
test set we take the NN output as 1 if it is greater than
0.5 and 0 otherwise. The best performing learning
algorithm was a hybrid conjugate gradient-quasi Newton
method, which takes advantage of both algorithms.
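
The decision rule of such a network can be sketched as
below. The activation functions are not specified above, so
a tanh hidden layer and a logistic output are assumptions,
as are the names `mlp_forward` and `classify`; training (the
hybrid conjugate gradient-quasi Newton method) is not shown.

```python
import numpy as np


def mlp_forward(X, W1, b1, W2, b2):
    """Forward pass: one hidden layer, one output in the range (0, 1).

    X: (n, d) patterns; W1: (d, h) and b1: (h,) for the hidden layer
    (h = 20 in the text); W2: (h,) and b2: scalar for the output.
    """
    H = np.tanh(np.asarray(X, dtype=float) @ W1 + b1)
    z = H @ W2 + b2
    return 1.0 / (1.0 + np.exp(-z))


def classify(X, params, threshold=0.5):
    """Return 1 (galaxy) where the output exceeds the threshold, else 0 (star)."""
    return (mlp_forward(X, *params) > threshold).astype(int)
```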

4. Experimental Results 

4.1 The data 

In order to test the performance of our method we used
a 2000x2000 arcsec^2 area centered on the North Galactic
Pole, extracted from the slightly compressed POSS-II F
plate n. 443 (available via network at the CADC). POSS-II
data were linearized using the sensitometric spots
recorded on the plate. The average FWHM of our data was
3 arcsec. The same area has been widely studied by others
and, in particular, by [?], [?], who used deep
observations obtained at the 3.6 m CFHT telescope in the
F photographic band under good seeing conditions
(FWHM < 1 arcsec) to derive a catalogue of objects
complete down to mp ~ 23. Their catalogue is therefore
based on data of much better quality and accuracy than
ours.

The selected region, a relatively empty one, slightly
penalizes our NN detection algorithms, which easily
recognize objects of quite different sizes and, contrary
to what happens with other algorithms, work well even on
very crowded areas, such as the centers of nearby
clusters of galaxies, as our preliminary test on a
portion of the Coma cluster (imaged on the same POSS-II
plate) shows [?].

4.2 The processing 

This POSS-II field was processed through several NN
detection algorithms (PCA NN's, Hierarchical Unsupervised
NN's, MLP's) and also through S-Extractor (= SEx; [?]),
which has come to be a standard in the astronomical
community. For details of the SEx application to our
dataset, refer to [?].

For the NN's, we used the PCA NN's to reduce the input
space to 3 dimensions. Then we ran the unsupervised NN's
on the 3-D input related to the 5x5 and 3x3 running
windows. In our experiments the best performing NN's
were: Neural gas (NG3), ML-Neural gas (MLNG3 or MLNG5),
ML-SOM (K5) and GCS+ML-Neural gas (NGCS5). We just wish
to stress here that, since the background subtraction is
a vital part of the detection, and in order not to give
an unfair advantage to any of the detection algorithms,
all algorithms, including SEx, were run on the same
background subtracted image.

Fig. 1 gives the number of "True" objects detected by
SEx (upper panel), i.e. objects having a counterpart in
the [7] catalog. As can be seen, the SEx catalog is
incomplete for mp > 21 mag, which is roughly the plate
completeness limit. The lower panel shows instead the
relative performance of the NN's, defined as the ratio
between the number of "True" objects detected by the
specific NN and by SEx. All the NN's and SEx turn out to
be roughly equivalent in detecting "True" objects
brighter than mp = 21, while for objects fainter than the
completeness limit of the plate, only MLNG5 is as
efficient as SEx, followed by MLNG3. Therefore,
differences among catalogs concern only galaxies fainter
than the plate completeness limit.

Fig. 2 shows the number of "False" objects detected by
SEx (upper panel), where "False" means objects not having
a counterpart in the [7] catalog, and which therefore
include a few "True" objects not catalogued by [7]
(mainly because they are too bright). We believe that all
objects brighter than mp = 20 mag are really "True" since
they are detected both by SEx and NN's with high
significance. The lower panel shows the relative
performance of the NN's, defined as the ratio of the
number of "False" objects detected by the NN and by SEx.
For objects brighter than mp = 19 mag, NN's and SEx have
similar performance, while at mp = 19.7 mag SEx works
better (but only for a few objects, see upper panel). NN
catalogues contain, however, fewer false detections.
MLNG5, which is also quite efficient in detecting "True"
objects, has a 20% cleaner detection rate in the highly
populated bin mp = 21.7 mag. MLNG3 is less efficient in
detecting "True" objects but is even cleaner of false
detections.

Fig. 3 shows the number of objects missed by SEx (upper
panel). "Missed" means being in the [7] catalog, but not
included in our catalogs. Obviously, the steep increase
below 21 mag coincides with the completeness limit of our
photographic material. The lower panel gives the relative
performance of the NN's, defined as the ratio between the
number of objects missed by the specific NN and by SEx.
MLNG3 and MLNG5 have a relative performance almost
constant at ~ 1, while the other NN's miss objects at
mp ~ 21 - 22 mag which, however, are still fainter than
the plate completeness limit.
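
The three classes used in Figs. 1 to 3 and the plotted
ratios can be sketched as follows. How counterparts are
identified is not stated above, so a simple positional match
within a tolerance `tol` is assumed; the function names
`cross_match` and `binned_ratio` are hypothetical.

```python
import numpy as np


def cross_match(det_xy, ref_xy, tol):
    """Match detections to a reference catalog by position.

    Returns two boolean arrays:
      det_has_ref[i] is True for "True" detections (False -> "False"),
      ref_has_det[k] is False for "Missed" reference objects.
    """
    det_xy = np.asarray(det_xy, dtype=float).reshape(-1, 2)
    ref_xy = np.asarray(ref_xy, dtype=float).reshape(-1, 2)
    d = np.linalg.norm(det_xy[:, None, :] - ref_xy[None, :, :], axis=2)
    close = d <= tol
    return close.any(axis=1), close.any(axis=0)


def binned_ratio(mag_nn, mag_sex, edges):
    """Ratio of NN to SEx object counts per magnitude bin (NaN if empty)."""
    n_nn, _ = np.histogram(mag_nn, bins=edges)
    n_sex, _ = np.histogram(mag_sex, bins=edges)
    return np.where(n_sex > 0, n_nn / np.maximum(n_sex, 1), np.nan)
```

Calling `binned_ratio` with the magnitudes of the "True",
"False" or "Missed" objects of a given NN and of SEx gives
the quantity shown in the corresponding lower panel.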

The class of "Missed" objects needs more attention. It
is likely that most of the objects fainter than mp = 21
mag are too faint to be detected with a 100% confidence
level, so we focus first on brighter objects. They can be
divided into:

- objects detected by [7] which correspond to empty
regions in our images. They can be missing because they
are variable, fast moving, or have an overestimated
luminosity in [7]. They can also be missed because they
are spurious in the template catalog or simply because
they are too faint;

- "True", nearby objects which are blended in our
image but not in that of [7];

- parts of isolated single large objects incorrectly
split by [7];

- a few detections aligned in the E-W direction on
the two sides of the images of a bright star. They are
likely false objects (diffraction spikes detected as
individual objects).

Therefore, a fair fraction of the "Missed" objects are
truly non-existent, and the performance figures of our
detection tools are therefore lower bounds at mp < 21
mag. We wish to stress here that, even though there is no
such thing as a perfect catalogue, the template by [7]
is, to the best of our knowledge, among the best ever
produced.

In [7], objects are classified in 2 major classes,
stars and galaxies, and a few minor classes (merged,
noise, spike, defects, etc.), which we neglect. The
efficiency of the detection is shown in Fig. 4 for three
representative detection algorithms: MLNG5, K5, and SEx.
At mp < 21 mag, the detection efficiency is large, close
to 1, and independent of the central concentration of the
light. Please note that there are no objects in the image
having mp < 16 mag and that in the following bin there
are only 4 galaxies. At fainter magnitudes (~ 22 - 23
mag) detection efficiencies differ as a function of both
the algorithm and the light concentration. In fact, SEx,
MLNG5 and, to a lesser extent, K5 turn out to be more
efficient in detecting galaxies than stars (in other
words: "Missed" objects are preferentially stars). For
SEx, a possible explanation is that a minimal area above
the background is required in order for the object to be
detected. At mp ~ 22 - 23 mag, noise fluctuations can
affect the isophotal area of unresolved objects, bringing
it below the assumed threshold (4 pixels). Among the
three detection algorithms considered, this bias is
smallest for the K5 NN. However, this is more likely due
to the fact that K5 misses more galaxies than the other
algorithms, rather than to the fact that it detects more
stars.
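
The effect of a minimal-area requirement can be illustrated
with a small thresholding sketch. It is not SEx itself: the
4-connectivity, the "at least `min_area` pixels" convention
and the name `detect_min_area` are assumptions made for
illustration.

```python
from collections import deque

import numpy as np


def detect_min_area(image, level, min_area=4):
    """Return connected groups of pixels above level with enough area.

    Pixels are grouped with 4-connectivity; groups with fewer than
    min_area pixels are discarded, as a faint unresolved object
    would be when noise shrinks its isophotal area.
    """
    mask = np.asarray(image) > level
    seen = np.zeros(mask.shape, dtype=bool)
    ny, nx = mask.shape
    objects = []
    for y0, x0 in zip(*np.nonzero(mask)):
        if seen[y0, x0]:
            continue
        # Breadth-first search over the connected region.
        queue = deque([(int(y0), int(x0))])
        seen[y0, x0] = True
        pixels = []
        while queue:
            y, x = queue.popleft()
            pixels.append((y, x))
            for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                yy, xx = y + dy, x + dx
                if 0 <= yy < ny and 0 <= xx < nx and mask[yy, xx] and not seen[yy, xx]:
                    seen[yy, xx] = True
                    queue.append((yy, xx))
        if len(pixels) >= min_area:
            objects.append(pixels)
    return objects
```

A compact source covering only one or two pixels above the
level is rejected, while an extended source of the same
total flux spread over more pixels may still pass, which is
consistent with the bias toward galaxies discussed above.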



5. Concluding Remarks 

In conclusion: MLNG3 and MLNG5 turn out to have
performance similar to SEx in detecting objects: they
produce catalogs which are cleaner of false detections
but, at the same time, are also slightly more incomplete
than SEx.

Figure 1: Number of "True" objects detected by SEx
(upper panel); relative performance of the NN's, defined
as the ratio of the number of "True" objects detected by
the NN and by SEx (lower panel).

Figure 2: Number of "False" objects detected by SEx
(upper panel); relative performance of the NN's, defined
as the ratio of the number of "False" objects detected by
the NN and by SEx (lower panel).

Figure 3: Number of objects "Missed" by SEx (upper
panel); relative performance of the NN's, defined as the
ratio of the number of objects missed by the NN and by
SEx (lower panel).

Figure 4: Percentage of objects detected by MLNG5, K5
and SEx.



We also want to stress that, since the less efficient
NN's produce catalogs which are much cleaner of false
detections, they can be used to select candidates for
possible detailed follow-up studies at magnitudes where
many of the objects detected by SEx would be false (i.e.
the selected objects would be in large part true, and not
just noise fluctuations).

A posteriori, one could argue that performance similar
to that of each of the NN's could be achieved by running
SEx with appropriate settings. However, it would be
unfair (and methodologically wrong) to fine-tune any of
the detection algorithms using a posteriori knowledge.